Structural maximum a posteriori linear regression for fast HMM adaptation

Authors

  • Olivier Siohan
  • Tor André Myrvoll
  • Chin-Hui Lee
Abstract

Transformation-based model adaptation techniques like maximum likelihood linear regression (MLLR) rely on an accurate selection of the number of transformations for a given amount of adaptation data. If too many transformations are used, the transformation parameters may be poorly estimated, can overfit the adaptation data, and generalize poorly. On the other hand, if the number of transformations is too small, the adapted models can provide only a moderate improvement over the baseline models. An adaptation approach should therefore be flexible enough to reliably estimate a large number of transformations when the amount of adaptation data is large, and a small number of transformations when only a few adaptation utterances are available. In this work, we show that a significant improvement can be obtained over MLLR with dynamic regression classes, first by replacing the maximum likelihood estimation criterion with a maximum a posteriori criterion, then by introducing a tree structure for the prior densities of the transformations. The effectiveness of the proposed approach is illustrated on the Spoke3 1993 test set of the WSJ task. Using the same regression classes as MLLR, it is shown that the proposed approach reduces the risk of overfitting and exploits the adaptation data much more efficiently than MLLR, leading to a significant reduction of the word error rate with as little as one adaptation utterance.
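
The two ideas in the abstract (a maximum a posteriori criterion in place of maximum likelihood, and priors organized along a tree) can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm: each "transformation" here is a single scalar w in y ≈ w·x rather than a full regression matrix, the prior is an assumed Gaussian with a hand-picked weight `tau`, and the tree, class names, and data values are invented for illustration. Each node's estimate is passed down as the prior mean of its children, so a class with very little data stays close to its parent instead of overfitting.

```python
# Illustrative sketch only: scalar transformations y ≈ w * x, not the paper's
# matrix formulation. `tau`, the tree layout, and the data are assumptions.

def map_scale(xs, ys, prior_mean, tau):
    """MAP estimate of w in y ≈ w*x under a Gaussian prior centered on prior_mean.

    tau is the prior weight; tau = 0 gives the maximum likelihood
    (least squares) estimate.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    denom = sxx + tau
    if denom == 0:
        # No data and no prior weight: fall back to the prior mean.
        return prior_mean
    return (sxy + tau * prior_mean) / denom


def structural_map(node, prior_mean, tau):
    """Estimate top-down; each node's estimate is the prior mean of its children."""
    xs, ys = node["data"]
    w = map_scale(xs, ys, prior_mean, tau)
    estimates = {node["name"]: w}
    for child in node.get("children", []):
        estimates.update(structural_map(child, w, tau))
    return estimates


# Hypothetical regression-class tree: the root pools the data of both classes.
tree = {
    "name": "root",
    "data": ([1.0, 2.0, 3.0, 1.0], [1.2, 2.4, 3.6, 2.0]),
    "children": [
        {"name": "class_a", "data": ([1.0, 2.0, 3.0], [1.2, 2.4, 3.6])},
        {"name": "class_b", "data": ([1.0], [2.0])},  # a single observation
    ],
}

# The root prior mean of 1.0 plays the role of an identity transformation.
estimates = structural_map(tree, prior_mean=1.0, tau=2.0)

# Maximum likelihood estimate for the sparse class, for comparison.
ml_b = map_scale([1.0], [2.0], prior_mean=1.0, tau=0.0)

for name, w in estimates.items():
    print(f"{name}: MAP estimate {w:.3f}")
print(f"class_b: ML estimate {ml_b:.3f}")
```

In this toy setup the maximum likelihood estimate for `class_b` fits its single observation exactly, while the structural MAP estimate lands between the root estimate and that observation. With more data in a class, the data terms dominate the prior and the two estimates converge.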


Similar resources

HMM adaptation for child speech synthesis

Hidden Markov Model (HMM)-based synthesis in combination with speaker adaptation has proven to be an approach that is well-suited for child speech synthesis [1]. This paper describes the development and evaluation of different HMM-based child speech synthesis systems. The aim is to determine the most suitable combination of initial model and speaker adaptation techniques to synthesize child spe...


Mean and covariance adaptation based on minimum classification error linear regression for continuous density HMMs

The performance of a speech recognition system deteriorates significantly when training and testing conditions are mismatched. This paper addresses the problem and proposes an algorithm to adapt the mean and covariance of an HMM simultaneously within the minimum classification error linear regression (MCELR) framework. Rather than estimating the transformation parameters using ...


Constrained Structural Maximum A for Average-Voice-Based

This paper proposes a constrained structural maximum a posteriori linear regression (CSMAPLR) algorithm for further improvement of speaker adaptation performance in HMM-based speech synthesis. In the algorithm, the concept of structural maximum a posteriori (SMAP) adaptation is applied to estimation of transformation matrices of the constrained MLLR (CMLLR), where recursive MAP-based estimation...


Feature space maximum a posteriori linear regression for adaptation of deep neural networks

We propose a feature space maximum a posteriori (MAP) linear regression framework to adapt parameters for context-dependent deep neural network hidden Markov models (CD-DNNHMMs). Due to the huge number of parameters used in DNN acoustic models in large vocabulary continuous speech recognition, the problem of over-fitting can be severe in DNN adaptation, often impairing the robustness of the a...


Quasi-Bayes linear regression for sequential learning of hidden Markov models

This paper presents an online/sequential linear regression adaptation framework for hidden Markov model (HMM) based speech recognition. We aim to sequentially improve a speaker-independent speech recognition system so that it can handle nonstationary environments via linear regression adaptation of HMMs. A quasi-Bayes linear regression (QBLR) algorithm is developed to execute the sequential a...



Journal:
  • Computer Speech & Language

Volume 16

Publication date: 2002